Local Vector-based Models for Sense Discrimination
Abstract
Word sense discrimination aims at automatically determining which instances of an ambiguous word share the same sense. A fully unsupervised technique based on a high dimensional vector representation of word senses was proposed by Schütze [10]. While this model was assumed to be Gaussian, results were only reported for the K-means approximation. In this work, a local vector-based model of reduced dimensionality which is linguistically coherent and can be computed for multivariate Gaussian mixtures is proposed. Several practical experiments are conducted on the New York Times News 1997 corpus. They show the advantages of unrestricted Gaussian models compared to K-means. The correct discrimination rate is further increased when using regularized Gaussian models as proposed in [2].
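The contrast between K-means and unrestricted Gaussian models can be illustrated with a small sketch. This is not the paper's implementation: the 2-D toy parameters, the function names, and the `reg * I` form of regularization are all assumptions made for illustration (the scheme of [2] may differ). It shows only the assignment step, with cluster parameters given rather than estimated by EM.

```python
import math

def sq_dist(x, mu):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(x, mu))

def kmeans_assign(x, means):
    """K-means rule: index of the nearest centroid."""
    return min(range(len(means)), key=lambda k: sq_dist(x, means[k]))

def gauss_logpdf_2d(x, mu, cov, reg=0.0):
    """Log density of a 2-D Gaussian with full covariance.

    reg * I is added to the covariance (an assumed, simple form of
    regularization; not necessarily the scheme of [2]).
    """
    a, b = cov[0][0] + reg, cov[0][1]
    c, d = cov[1][0], cov[1][1] + reg
    det = a * d - b * c
    dx, dy = x[0] - mu[0], x[1] - mu[1]
    # Mahalanobis distance using the closed-form 2x2 inverse.
    maha = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return -0.5 * (maha + math.log(det)) - math.log(2 * math.pi)

def gmm_assign(x, weights, means, covs, reg=0.0):
    """Gaussian mixture rule: index of the component with highest posterior."""
    scores = [math.log(w) + gauss_logpdf_2d(x, m, c, reg)
              for w, m, c in zip(weights, means, covs)]
    return max(range(len(scores)), key=lambda k: scores[k])

# Hypothetical toy senses: sense 0 is elongated along the first axis,
# sense 1 is compact. These numbers are illustrative only.
means = [(0.0, 0.0), (4.0, 0.0)]
covs = [[[9.0, 0.0], [0.0, 0.25]],
        [[0.25, 0.0], [0.0, 0.25]]]
weights = [0.5, 0.5]
x = (2.5, 0.0)

print("K-means label:", kmeans_assign(x, means))
print("Gaussian label:", gmm_assign(x, weights, means, covs))
print("Heavily regularized label:", gmm_assign(x, weights, means, covs, reg=100.0))
```

For this toy point, the nearest-centroid rule picks sense 1, while the full-covariance Gaussian rule picks sense 0 because the point lies well within the spread of the elongated cluster. With a very large `reg`, both covariances become nearly spherical and the Gaussian rule falls back to the K-means answer.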
Similar resources
One Representation per Word - Does it make Sense for Composition?
In this paper, we investigate whether an a priori disambiguation of word senses is strictly necessary or whether the meaning of a word in context can be disambiguated through composition alone. We evaluate the performance of off-the-shelf single-vector and multi-sense vector models on a benchmark phrase similarity task and a novel task for word-sense discrimination. We find that single-sense vec...
Using Kullback-Leibler distance for performance evaluation of search designs
This paper considers the search problem introduced by Srivastava, which is a model discrimination problem. In the context of search linear models, the discrimination ability of search designs has been studied by several researchers. Some criteria have been developed to measure this capability; however, they are restricted in a sense of being able to work for searching only one possibl...
Word Sense Disambiguation Using Vectors of Co-occurrence Information
This paper reports on the word sense disambiguation of Korean nouns using co-occurrence information in context. For a given noun, the local contextual word distribution is not enough to express its semantic characteristics for noun sense disambiguation. This paper proposes a cluster-based sense as a base vector. Contextual noise is removed by a term weighting method, and hypernyms of remain...
Efficiency Analysis Based on Separating Hyperplanes for Improving Discrimination among DMUs
Data envelopment analysis (DEA) is a non-parametric method for evaluating the relative technical efficiency for each member of a set of peer decision making units (DMUs) with multiple inputs and multiple outputs. The original DEA models use positive input and output variables that are measured on a ratio scale, but these models do not apply to the variables in which interval scale data can appe...
Word Sense Discrimination by Clustering Contexts in Vector and Similarity Spaces
This paper systematically compares unsupervised word sense discrimination techniques that cluster instances of a target word that occur in raw text using both vector and similarity spaces. The context of each instance is represented as a vector in a high dimensional feature space. Discrimination is achieved by clustering these context vectors directly in vector space and also by finding pairwis...
Publication date: 2005